Description

Technical field

This invention relates to a method and means for microbial polypeptide expression.

Background

Genetic information is encoded on double-stranded deoxyribonucleic acid ("DNA" or "genes")
according to the order in which the DNA coding strand presents the characteristic bases of its repeating
nucleotide components. "Expression" of the encoded information to form polypeptides involves a two-part
process. According to the dictates of certain control regions ("regulons") in the gene, RNA polymerase may
be caused to move along the coding strand, forming messenger RNA (ribonucleic acid) in a process called
"transcription." In a subsequent "translation" step the cell's ribosomes, in conjunction with transfer RNA,
convert the mRNA "message" into polypeptides. Included in the information mRNA transcribes from DNA
are signals for the start and termination of ribosomal translation, as well as the identity and sequence of the
amino acids which make up the polypeptide. The DNA coding strand comprises long sequences of
nucleotide triplets called "codons" because the characteristic bases of the nucleotides in each triplet or
codon encode specific bits of information. For example, three nucleotides read as ATG (adenine-thymine-guanine)
result in an mRNA signal interpreted as "start translation", while termination codons TAG, TAA
and TGA are interpreted as "stop translation". Between the start and stop codons lies the so-called structural
gene, whose codons define the amino acid sequence ultimately translated. That definition proceeds
according to the well-established "genetic code" (e.g., J. D. Watson, Molecular Biology of the Gene, W. A.
Benjamin Inc., N.Y., 3rd ed., 1976) which describes the codons for the various amino acids. The genetic
code is degenerate in the sense that different codons may yield the same amino acid, but precise in that for
each amino acid there are one or more codons for it and no other. Thus, for example, all of the codons TCT,
TCC, TCA, TCG, AGT and AGC, when read as such, encode serine and no other amino acid. During
translation the proper reading phase or reading frame must be maintained. Consider for example what
happens when the ribosome reads different bases as the beginning of a codon in the
sequence . . .GCTGGTTGTAAG. . . :



. . .GCT GGT TGT AAG. . . → . . .Ala-Gly-Cys-Lys. . .
. . .G CTG GTT GTA AG. . . → . . .Leu-Val-Val. . .
. . .GC TGG TTG TAA G. . . → . . .Trp-Leu-(STOP)
The polypeptide ultimately produced, then, depends vitally upon the spatial relationship of the structural
gene with respect to the regulon.
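The effect of reading frame can be reproduced with a short sketch. The following Python fragment is offered by way of illustration only; the function name `translate` and the abbreviated codon table (limited to the triplets that occur in the example sequence) are assumptions introduced here and form no part of the original disclosure.

```python
# Abbreviated codon table: only the triplets that occur in the example
# sequence, written in DNA coding-strand notation as in the text.
CODONS = {
    "GCT": "Ala", "GGT": "Gly", "TGT": "Cys", "AAG": "Lys",
    "CTG": "Leu", "GTT": "Val", "GTA": "Val",
    "TGG": "Trp", "TTG": "Leu", "TAA": "STOP",
}


def translate(dna, offset):
    """Read complete triplets starting at `offset`; halt at a stop codon."""
    residues = []
    for i in range(offset, len(dna) - 2, 3):
        residue = CODONS[dna[i:i + 3]]
        residues.append(residue)
        if residue == "STOP":
            break
    return "-".join(residues)


sequence = "GCTGGTTGTAAG"
for offset in range(3):
    # Each offset corresponds to one of the three possible reading frames.
    print(offset, translate(sequence, offset))
```

Shifting the starting base by one or two positions changes every downstream codon, which is why the structural gene must be positioned in phase with the regulon.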

A clearer understanding of the process of genetic expression will emerge once certain components of
genes are defined:

Operon - A gene comprising structural gene(s) for polypeptide expression and the control region
("regulon") which regulates that expression.

Promoter - A gene within the regulon to which RNA polymerase must bind for initiation of
transcription.

Operator - A gene to which repressor protein may bind, thus preventing RNA polymerase binding on
the adjacent promoter.

Inducer - A substance which deactivates repressor protein, freeing the operator and permitting RNA
polymerase to bind to promoter and commence transcription.

Catabolite activator protein ("CAP") binding site - A gene which binds cyclic adenosine
monophosphate ("cAMP")-mediated CAP, also commonly required for initiation of transcription. The
CAP binding site may in particular cases be unnecessary. For example, a promoter mutation in the lactose
operon of the phage λ plac UV5 eliminates the requirement for cAMP and CAP expression. J. Beckwith et al.,
J. Mol. Biol. 69, 155-160 (1972).

Promoter-operator system - As used herein, an operable control region of an operon, with or without
respect to its inclusion of a CAP binding site or capacity to code for repressor protein expression.

Further by way of definition, and for use in the discussion of recombinant DNA which follows, we
define the following:

Cloning vehicle - Non-chromosomal double stranded DNA comprising an intact "replicon" such that
the vehicle is replicated when placed within a unicellular organism ("microbe") by a process of
"transformation". An organism so transformed is called a "transformant".

Plasmid - For present purposes, a cloning vehicle derived from viruses or bacteria, the latter being
"bacterial plasmids."

Complementarity - A property conferred by the base sequences of single strand DNA which permits
the formation of double stranded DNA through hydrogen bonding between complementary bases on the
respective strands. Adenine (A) complements thymine (T), while guanine (G) complements cytosine (C).
Advances in biochemistry in recent years have led to the construction of "recombinant" cloning
vehicles in which, for example, plasmids are made to contain exogenous DNA. In particular instances the
recombinant may include "heterologous" DNA, by which is meant DNA that codes for polypeptides
ordinarily not produced by the organism susceptible to transformation by the recombinant vehicle. Thus,
plasmids are cleaved to provide linear DNA having ligatable termini. These are bound to an exogenous
gene having ligatable termini to provide a biologically functional moiety with an intact replicon and a
desired phenotypical property. The recombinant moiety is inserted into a microorganism by
transformation and transformants are isolated and cloned, with the object of obtaining large populations
capable of expressing the new genetic information. Methods and means of forming recombinant cloning
vehicles and transforming organisms with them have been widely reported in the literature. See, e.g., H. L.
Heynecker et al., Nature 263, 748-752 (1976); Cohen et al., Proc. Nat. Acad. Sci. USA 69, 2110 (1972); ibid.,
70, 1293 (1973); ibid., 70, 3240 (1973); ibid., 71, 1030 (1974); Morrow et al., Proc. Nat. Acad. Sci. U.S.A. 71,
1743 (1974); Novick, Bacteriological Rev., 33, 210 (1969); Hershfield et al., Proc. Natl. Acad. Sci. U.S.A. 71,
3455 (1974) and Jackson et al., ibid., 69, 2904 (1972). A generalized discussion of the subject appears in S.
Cohen, Scientific American 233, 24 (1975). These and other publications alluded to herein are incorporated
by reference.

A variety of techniques are available for DNA recombination, according to which adjoining ends of
separate DNA fragments are tailored in one way or another to facilitate ligation. The latter term refers to the
formation of phosphodiester bonds between adjoining nucleotides, most often through the agency of the
enzyme T4 DNA ligase. Thus, blunt ends may be directly ligated. Alternatively, fragments containing
complementary single strands at their adjoining ends are advantaged by hydrogen bonding which
positions the respective ends for subsequent ligation. Such single strands, referred to as cohesive termini,
may be formed by the addition of nucleotides to blunt ends using terminal transferase, and sometimes
simply by chewing back one strand of a blunt end with an enzyme such as λ-exonuclease. Again, and most
commonly, resort may be had to restriction endonucleases, which cleave phosphodiester bonds in and
around unique sequences of nucleotides of about 4 to 6 base pairs in length. Many restriction
endonucleases and their recognition sites are known, the so-called Eco RI endonuclease being most widely
employed. Restriction endonucleases which cleave double-stranded DNA at rotationally symmetric
"palindromes" leave cohesive termini. Thus, a plasmid or other cloning vehicle may be cleaved, leaving
termini each comprising half the restriction endonuclease recognition site. A cleavage product of
exogenous DNA obtained with the same restriction endonuclease will have ends complementary to those
of the plasmid termini. Alternatively, as disclosed infra, synthetic DNA comprising cohesive termini may be
provided for insertion into the cleaved vehicle. To discourage rejoinder of the vehicles' cohesive termini
pending insertion of exogenous DNA, the termini can be digested with alkaline phosphatase, providing
molecular selection for closures incorporating the exogenous fragment. Incorporation of a fragment
having the proper orientation relative to other aspects of the vehicle may be enhanced when the fragment
supplants vehicle DNA excised by two different restriction endonucleases, and itself comprises termini
respectively constituting half the recognition sequence of the different endonucleases.
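The notions of a rotationally symmetric recognition site and of cohesive termini can be sketched in a few lines of Python. The sketch assumes the commonly cited Eco RI site GAATTC, cut after the first G on each strand; that detail, the toy sequence and the function names are illustrative assumptions and are not taken from the passage above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site):
    """A rotationally symmetric site reads identically 5' to 3' on both strands."""
    return site == reverse_complement(site)


def cut_top_strand(seq, site, cut_offset):
    """Split the top strand at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that remain on the
    upstream fragment (assumed to be 1 for Eco RI, i.e. G^AATTC).
    """
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


ECO_RI = "GAATTC"
toy_dna = "CCGAATTCGGTACCGAATTCAA"  # hypothetical sequence with two sites

print(is_palindromic(ECO_RI))
print(cut_top_strand(toy_dna, ECO_RI, 1))
```

Because the site is its own reverse complement, the bottom strand is cut at the mirror-image position, so every fragment carries the same single-stranded AATT extension and any two fragments produced by the enzyme can anneal to one another.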

Despite wide-ranging work in recent years in recombinant DNA research, few results susceptible to
immediate and practical application have emerged. This has proven especially so in the case of failed
attempts to express polypeptides and the like coded for by "synthetic DNA", whether constructed
nucleotide by nucleotide in the conventional fashion or obtained by reverse transcription from isolated
mRNA (complementary or "cDNA"). In this application we describe what appears to represent the first
expression of a functional polypeptide product from a synthetic gene, together with related developments
which promise wide-spread application. The product referred to is somatostatin (Guillemin U.S.P.
3,904,594), an inhibitor of the secretion of growth hormone, insulin and glucagon whose effects suggest its
application in the treatment of acromegaly, acute pancreatitis and insulin-dependent diabetes. See R.
Guillemin et al., Annual Rev. Med. 27, 379 (1976). The somatostatin model clearly demonstrates the
applicability of the new developments described here on numerous and beneficial fronts, as will appear
from the accompanying drawings and more clearly from the detailed description which follows.

Summary of invention 

According to the invention there is provided a recombinant plasmid suited for transformation of a
bacterial host wherein the plasmid comprises a homologous regulon, heterologous DNA, and one or more
termination codon(s), the heterologous DNA encoding a desired functional heterologous polypeptide or
intermediate therefor which is not degraded by endogenous proteolytic enzymes, said DNA being
positioned in proper reading frame with said homologous regulon between said regulon and the
termination codon(s), whereby on translation of the transcription product of the heterologous DNA in a
suitable bacterium, the resulting expression product is said desired functional polypeptide or intermediate
therefor in recoverable form.

According to a further aspect of the invention, there is provided a process for the production of a
recombinant plasmid as defined above which comprises treating a length of double stranded DNA
comprising an intact replicon and, in sequence, (a) a regulon for controlling transcription and translation in
a bacterial host and (b) a restriction endonuclease recognition site, with a suitable restriction endonuclease
to form a DNA fragment that comprises the replicon and the regulon, and ligating thereto in proper reading
frame with said regulon heterologous DNA encoding a functional heterologous polypeptide or
intermediate therefor which is not degraded by endogenous proteolytic enzymes, said heterologous DNA
having a terminal nucleotide grouping which is ligatable to said DNA fragment, to give said recombinant
plasmid; and bacteria transformed with said plasmids.

The invention also provides a process for the production of a functional polypeptide in the form of an
immunogenic substance, comprising:

(a) providing a recombinant plasmid containing a homologous regulon, and in proper reading frame
therewith, a heterologous DNA sequence encoding the hapten, a DNA sequence encoding a second amino
acid sequence sufficient in size to render the product of DNA expression immunogenic and one or more
termination codons;

(b) growing a bacterium transformed with the recombinant plasmid, occasioning expression of a
conjugate polypeptide consisting essentially of the amino acid sequence of the hapten and the second
amino acid sequence; and

(c) testing the conjugate polypeptide for its ability to raise antibodies against said hapten.

Brief description of the drawings

The accompanying drawings illustrate one context in which preferred embodiments of the invention 
find application, i.e., expression of the hormone somatostatin by bacterial transformants containing 
recombinant plasmids. 

Figure 1. Schematic outline of the process: the gene for somatostatin, made by chemical DNA
synthesis, is fused to the E. coli β-galactosidase gene on the plasmid pBR322. After transformation into
E. coli, the recombinant plasmid directs the synthesis of a precursor protein which can be specifically cleaved
in vitro at methionine residues by cyanogen bromide to yield active mammalian polypeptide hormone. A,
T, C and G denote the characteristic bases (respectively adenine, thymine, cytosine and guanine) of the
deoxyribonucleotides in the coding strand of the somatostatin gene.

Figure 2. Schematic structure of a synthetic gene whose coding strand (i.e., the "upper" strand)
comprises codons for the amino acid sequence of somatostatin (given).

Figure 3. Schematic illustration of preferred method for construction of nucleotide trimers used in
constructing synthetic genes. In the conventional notation employed to depict nucleotides in Figure 3, the
5' OH is to the left and the 3' OH to the right.

Figure 4. Flow chart for the construction of a recombinant plasmid (e.g., pSOM11-3) capable of
expressing a somatostatin ("SOM")-containing protein, beginning with the parental plasmid pBR322. In
Figure 4 the approximate molecular weight of each plasmid is stated in daltons ("d"). Ap^r and Tc^r
respectively denote genes for ampicillin and tetracycline resistance, while Tc^s denotes tetracycline
susceptibility resulting from excision of a portion of the Tc^r gene. The relative positions of various
restriction endonuclease specific cleavage sites on the plasmids are depicted (e.g., Eco RI, Bam I, etc.).

Figures 5A and 5B. The nucleotide sequences of key portions of two plasmids are depicted, as is the
direction of messenger RNA ("mRNA") transcription, which invariably proceeds from the 5' end of the
coding strand. Restriction endonuclease substrate sites are as shown. Each depicted sequence contains
both the control elements of the lac (lactose) operon, and codons for expression of the amino acid
sequence of somatostatin (italics). The amino acid sequence numbers of β-galactosidase ("β-gal") are in
brackets.

Figures 6-8. As more particularly described in the "Experimental" discussion, infra, these depict the
results of comparative radioimmune assay experiments which demonstrate the somatostatin activity of
product expressed by the recombinant plasmids.

Figure 9. Schematic structure of synthetic genes whose coding strands comprise codons for the amino
acid sequences of the A and B strands of human insulin.

Figure 10. Flow chart for construction of a recombinant plasmid capable of expressing the B chain of
human insulin.



Detailed description 

1. Preparation of genes coding for heterologous polypeptide 

DNA coding for any polypeptide of known amino acid sequence may be prepared by choosing codons
according to the genetic code. For ease in purification, etc., oligodeoxyribonucleotide fragments of, for
example, from about 11 to about 16 nucleotides are prepared separately, then assembled in the desired
sequence. Thus, one prepares first and second series of oligodeoxyribonucleotide fragments of convenient
size. The first series, when joined in proper sequence, yield a DNA coding strand for polypeptide
expression (see, e.g., Figure 2, fragments A, B, C and D). The second series, when likewise joined in proper
sequence, yield a strand complementary to the coding strand (e.g., Figure 2, fragments E, F, G and H). The
fragments of the respective strands preferably overlap such that complementarity promotes their self
assembly through hydrogen bonding of the cohesive termini of fragment blocks. Following assembly, the
structural gene is completed by ligation in the conventional manner.

The degeneracy of the genetic code permits substantial freedom in the choice of codons for any given
amino acid sequence. For present purposes, however, codon choice was advantageously guided by three
considerations. First, codons and fragments were selected, and fragment assembly was staged, so as to
avoid undue complementarity of the fragments, one with another, save for fragments adjacent one another
in the intended gene. Secondly, sequences rich in AT base pairs (e.g., about five or more) are avoided,
particularly when preceded by a sequence rich in GC base pairs, to avoid premature termination of
transcription. Thirdly, at least a majority of the codons chosen are those preferred in the expression of
microbial genomes (see, e.g., W. Fiers, et al., Nature 260, 500 (1976)). For purposes of the appended claims,
we define the following as codons "preferred for the expression of microbial genomes":

TABLE I

Preferred assignment of codons

First position   Second position (read across)              Third position
(5' end)         T            C      A      G               (3' end)
(read down)                                                 (read down)

T                phe          -      -      cys             T
                 phe          ser    tyr    -               C
                 leu          -      Stop   Stop            A
                 -            ser    Stop   trp             G

C                leu          pro    his    arg             T
                 leu          pro    his    arg             C
                 leu          pro    gln    -               A
                 -            pro    gln    -               G

A                ile          thr    asn    -               T
                 ile          thr    asn    ser             C
                 -            -      -      -               A
                 met(start)   thr    lys    -               G

G                val          ala    asp    gly             T
                 val          -      asp    -               C
                 val          -      glu    -               A
                 val          ala    glu    -               G


Most preferably in the case of somatostatin, the amino acid (codon) relationships of the structural gene
are: gly (GGT); cys (TGT); lys (AAG); trp (TGG); ala (GCT, GCG); asn (AAT, AAC); phe (TTC, TTT); thr (ACT,
ACG); and ser (TCC, TCG).
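By way of illustration, the codon assignments just listed can be applied mechanically to an amino acid sequence. The Python sketch below assumes the 14-residue somatostatin sequence as it is generally reported (it is not reproduced in this passage), picks among alternative codons by a simple index, and brackets the result with a start codon and two stop codons chosen purely for the example. It does not purport to reproduce the gene of Figure 2, and the function names are hypothetical.

```python
# Codons listed in the text for somatostatin; where two are listed,
# both are kept and the caller selects one by index.
SOMATOSTATIN_CODONS = {
    "gly": ["GGT"], "cys": ["TGT"], "lys": ["AAG"], "trp": ["TGG"],
    "ala": ["GCT", "GCG"], "asn": ["AAT", "AAC"], "phe": ["TTC", "TTT"],
    "thr": ["ACT", "ACG"], "ser": ["TCC", "TCG"],
}

# Assumption: somatostatin sequence taken from general knowledge.
SOMATOSTATIN = ["ala", "gly", "cys", "lys", "asn", "phe", "phe",
                "trp", "lys", "thr", "phe", "thr", "ser", "cys"]


def back_translate(residues, table, pick=0):
    """Concatenate one codon per residue; `pick` indexes the alternatives."""
    codons = []
    for residue in residues:
        options = table[residue]
        codons.append(options[min(pick, len(options) - 1)])
    return "".join(codons)


def longest_at_run(dna):
    """Length of the longest uninterrupted stretch of A/T bases."""
    longest = current = 0
    for base in dna:
        current = current + 1 if base in "AT" else 0
        longest = max(longest, current)
    return longest


coding = back_translate(SOMATOSTATIN, SOMATOSTATIN_CODONS)
gene = "ATG" + coding + "TGATAG"  # start and stop codons chosen for illustration

print(gene)
print("longest A/T run:", longest_at_run(coding))
```

A helper such as `longest_at_run` gives a quick check against the second consideration stated above (avoidance of AT-rich stretches); in practice the choice among alternative codons would be made fragment by fragment rather than by a single index.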

Where the structural gene of a desired polypeptide is to be inserted in a cloning vehicle for expression
as such, the gene is preceded by a "start" codon (e.g., ATG) and immediately followed by one or more
termination or stop codons (see Fig. 2). However, as described infra, the amino acid sequence of a
particular polypeptide may be expressed with additional protein preceding and/or following it. If the
intended use of the polypeptide requires cleavage of the additional protein, appropriate cleavage sites are
coded for adjacent the polypeptide-additional protein codon junction. Thus, in Figure 1 as an example, the
expression product is a precursor protein comprising both somatostatin and the greatest part of the
β-galactosidase polypeptide. Here ATG is not required to code for the start of translation because
ribosomal translation of the additional β-gal protein reads through into the somatostatin structural gene.
Incorporation of the ATG signal, however, codes for the production of methionine, an amino acid
specifically cleaved by cyanogen bromide, affording a facile method for converting precursor protein into
the desired polypeptide.
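The release of a methionine-free product from a precursor can be sketched as follows. The Python fragment assumes that cyanogen bromide cleaves on the carboxyl side of each methionine, uses one-letter amino acid codes, and employs a short hypothetical leader in place of the β-galactosidase portion; the somatostatin sequence is again taken from general knowledge rather than from this passage.

```python
def cnbr_cleave(protein):
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


LEADER = "MKLSTE"                # hypothetical stand-in for the beta-gal portion
SOMATOSTATIN = "AGCKNFFWKTFTSC"  # assumption: sequence from general knowledge

# Precursor: leader, then the methionine encoded by ATG, then the product.
precursor = LEADER + "M" + SOMATOSTATIN

fragments = cnbr_cleave(precursor)
print(fragments)
print(fragments[-1] == SOMATOSTATIN)
```

Because the desired product contains no methionine, it emerges intact as the final fragment, while the superfluous protein is cut wherever methionine occurs.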

Figure 2 also exemplifies a further feature preferred in heterologous DNA intended for recombinant
employment, i.e., the provision of cohesive termini, preferably comprising one of the two strands of a
restriction endonuclease recognition site. For reasons previously discussed, the termini are preferably
designed to create respectively different recognition sites upon recombination.

While the developments described here have been demonstrated as successful with the somatostatin
model, it will be appreciated that heterologous DNA coding for virtually any known amino acid sequence
may be employed, mutatis mutandis. Thus, the techniques previously and hereafter discussed are
applicable, mutatis mutandis, to the production of poly(amino)acids, such as polyleucine and polyalanine;
enzymes; serum proteins; analgesic polypeptides, such as β-endorphins, which modulate thresholds of
pain, etc. Most preferably, the polypeptides produced as such will be mammalian hormones or
intermediates therefor. Among such hormones may be mentioned, e.g., somatostatin, human insulin,
human and bovine growth hormone, luteinizing hormone, ACTH, pancreatic polypeptide, etc.
Intermediates include, for example, human preproinsulin, human proinsulin, the A and B chains of human
insulin and so on. In addition to DNA made in vitro, the heterologous DNA may comprise cDNA resulting
from reverse transcription from mRNA. See, e.g., Ullrich et al., Science 196, 1313 (1977).

2. Recombinants coding for the expression of precursor protein 

In the process schematically depicted in Figure 1, expression yields a precursor protein comprising
both a polypeptide coded for by a specific heterologous structural gene (somatostatin) and additional
protein (comprising a portion of the β-galactosidase enzyme). A selective cleavage site adjacent the
somatostatin amino acid sequence permits subsequent separation of the desired polypeptide from
superfluous protein. The case illustrated is representative of a large class of procedures made available by
the techniques described herein.

Most commonly, cleavage will be effected outside the replicative environment of the plasmid or other
vehicle as, for example, following harvest of the microbial culture. In this fashion temporary conjugation of
small polypeptides with superfluous protein may preserve the form against, e.g., in vivo degradation by
endogenous enzymes. At the same time, the additional protein will ordinarily rob the desired polypeptide
of bioactivity pending extra-cellular cleavage, with the effect of enhancing the biosafety of the procedure. In
particular instances, of course, it may prove desirable to effect cleavage within the cell. For example,
cloning vehicles could be provided with DNA coding for enzymes which convert insulin precursors to the
active form, operating in tandem with other DNA coding for expression of the precursor form.

In the preferred case, the particular polypeptide desired lacks internal cleavage sites corresponding to
that employed to shed superfluous protein, although it will be appreciated that where that condition is not
satisfied competition reactions will yet give the desired product, albeit in lower yield. Where the desired
product is methionine-free, cyanogen bromide cleavage at methionine adjacent the desired sequence has
proven highly effective. Likewise, arginine- and lysine-free products may be enzymatically cleaved with,
e.g., trypsin or chymotrypsin at arg-arg, lys-lys or like cleavage sites adjacent the desired sequence. In the
case where cleavage leaves, e.g., unwanted arginine attached to desired product, it may be removed by
carboxypeptidase digestion. When trypsin is employed to cleave at arg-arg, lysine sites within the desired
polypeptide may first be protected, as with maleic or citraconic anhydrides. The cleavage techniques
discussed here by way of example are but representative of the many variants which will occur to the
art-skilled in light of the specification.

Cleavable protein may be expressed adjacent either the C- or N-terminals of a specific polypeptide, or
even within the polypeptide itself, as in the case of the included sequence which distinguishes proinsulin
and insulin. Again, the vehicle employed may code for expression of protein comprising repeated
sequences of the desired polypeptide, each separated by selective cleavage sites. Most preferably,
however, codons for superfluous protein will be translated in advance of the structural gene of the desired
product, as in the case illustrated in the Figures. In every case care should be taken to maintain the proper
reading frame relative to the regulon.

3. Expression of immunogens

The ability to express both a specific polypeptide and superfluous protein provides useful tools for the
production of immunogenic substances. Polypeptide "haptens" (i.e. substances containing determinants
specifically bound by antibodies and the like but ordinarily too small to elicit an immune response) can be
expressed as conjugates with additional protein sufficient in size to confer immunogenicity. Indeed, the
β-gal-somatostatin conjugate produced here by way of example is of immunogenic size and may be
expected to raise antibodies which bind the somatostatin hapten. Proteins comprising in excess of 100
amino acids, most commonly in excess of 200 such, exhibit immunogenic character.

Conjugates prepared in the foregoing fashion may be employed to raise antibodies useful in
radioimmune or other assays for the hapten, and alternatively in the production of vaccines. We next
describe an example of the latter application. Cyanogen bromide or other cleavage products of viral coat
protein will yield oligopeptides which bind to antibody raised to the protein itself. Given the amino acid
sequence of such an oligopeptide hapten, heterologous DNA therefor may be expressed as a conjugate
with additional protein which confers immunogenicity. Use of such conjugates as vaccines could be
expected to diminish side reactions which accompany use of coat protein itself to confer immunity.

4. The control elements

Figure 1 depicts a process wherein a transformant organism expresses polypeptide product from
heterologous DNA brought under the control of a regulon "homologous" to the organism in its
untransformed state. Thus, lactose-dependent E. coli chromosomal DNA comprises a lactose or lac
operon which mediates lactose digestion by, inter alia, elaborating the enzyme β-galactosidase. In the
particular instance illustrated, the lac control elements are obtained from a bacteriophage, λ plac 5, which is
infective for E. coli. The phage's lac operon, in turn, was derived by transduction from the same
bacterial species, hence the "homology". Homologous regulons suitable for use in the disclosed process
may alternatively derive from plasmidic DNA native to the organism.

The simplicity and efficiency of the lac promoter-operator system commend its use in the systems we
describe, as does its ability to be induced by IPTG (isopropylthio-β-D-galactoside). Of course, other operons
or portions thereof could be employed as well, e.g., lambda promoter-operator, arabinose operon (phi 80
dara), or the colicine E1, galactose, alkaline phosphatase or tryptophan operons. Promoter-operators
derived from the latter (i.e., "tryp operon") would be expected to confer 100% repression pending
induction (with indoleacrylic acid) and harvest.

5. Plasmid construction generally

The details of the process schematically illustrated in Figure 4 appear from the Experimental section,
infra. At this point, however, it is useful to briefly discuss various of the techniques employed in
constructing the recombinant plasmid of the preferred embodiment.

The cloning and expression of the synthetic somatostatin gene employed two plasmids. Each plasmid
has an EcoRI substrate site at a different region of the β-galactosidase structural gene (see Figures 4 and 5).
The insertion of the synthetic somatostatin DNA fragment into the EcoRI sites of these plasmids brings the
expression of the genetic information in that fragment under control of the lac operon controlling elements.
Following the insertion of the somatostatin fragment into these plasmids, translation should result in a
somatostatin polypeptide preceded either by 10 amino acids (pSOM1) or by virtually the whole
β-galactosidase subunit structure (pSOM11-3).

The plasmid construction scheme initiates with plasmid pBR322, a well-characterized cloning vehicle.
Introduction of the lac elements to this plasmid was accomplished by insertion of a HaeIII restriction
endonuclease fragment (203 nucleotides) carrying the lac promoter, CAP binding site, operator, ribosome
binding site, and the first 7 amino acid codons of the β-galactosidase structural gene. The HaeIII fragment
was derived from λplac5 DNA. The EcoRI-cleaved pBR322 plasmid, which had its termini repaired with T4
DNA polymerase and deoxyribonucleotide triphosphates, was blunt-end ligated to the HaeIII fragment to
create EcoRI termini at the insertion points. Joining of these HaeIII and repaired EcoRI termini generates the
EcoRI restriction site (see Figs. 4 and 5) at each terminus. Transformants of E. coli RR1 with this DNA were
selected for resistance to tetracycline (Tc) and ampicillin (Ap) on 5-bromo-4-chloro-indolyl-galactoside
(X-gal) medium. On this indicator medium, colonies constitutive for the synthesis of β-galactosidase, by
virtue of the increased number of lac operators titrating repressor, are identified by their blue color. Two
orientations of the HaeIII fragment are possible but these were distinguished by the asymmetric location of
an Hha restriction site in the fragment. Plasmid pBH10 was further modified to eliminate the EcoRI
endonuclease site distal to the lac operator (pBH20).

The eight chemically synthesized oligodeoxyribonucleotides (Fig. 2) were labeled at the 5' termini with
[32P]-γ-ATP by polynucleotide kinase and joined with T4 DNA ligase. Through hydrogen bonding between
the overlapping fragments, the somatostatin gene self-assembles and eventually polymerizes into larger
molecules because of the cohesive restriction site termini. The ligated products were treated with EcoRI
and BamHI restriction endonucleases to generate the somatostatin gene as depicted in Figure 2.

The synthetic somatostatin gene fragment with EcoRI and BamHI termini was ligated to the pBH20
plasmid, previously treated with the EcoRI and BamHI restriction endonucleases and alkaline phosphatase.
The treatment with alkaline phosphatase provides a molecular selection for plasmids carrying the inserted
fragment. Ampicillin-resistant transformants obtained with this ligated DNA were screened for tetracycline
sensitivity and several were examined for the insertion of an EcoRI-BamHI fragment of the appropriate size.

Both strands of the EcoRI-BamHI fragments of plasmids from two clones were analyzed by nucleotide
sequence analysis starting from the BamHI and EcoRI sites. The sequence analysis was extended into the
lac controlling elements; the lac fragment sequence was intact, and in one case, pSOM1, the nucleotide
sequences of both strands were independently determined, each giving the sequence depicted in Figure 5A.
The EcoRI-PstI fragment of the pSOM1 plasmid, with the lac-controlling element, was removed and
replaced with the EcoRI-PstI fragment of pBR322 to produce the plasmid pSOM11. The EcoRI fragment of λ
plac 5, carrying the lac operon control region and most of the β-galactosidase structural gene, was inserted
into the EcoRI site of pSOM11. Two orientations of the EcoRI lac fragment of λ plac 5 were expected. One of
these orientations would maintain the proper reading frame into the somatostatin gene, the other would
not. Analysis of independently isolated clones for somatostatin activity then identified clones containing
the properly oriented gene, of which the clone designated pSOM11-3 was one.

6. The microorganism

Various unicellular microorganisms have been proposed as candidates for transformation, such as
bacteria, fungi and algae, that is, those unicellular organisms which are capable of being grown in cultures
or fermentation. Bacteria are for the most part the most convenient organisms to work with. Bacteria which
are susceptible to transformation include members of the Enterobacteriaceae, such as strains of
Escherichia coli and Salmonella; Bacillaceae, such as Bacillus subtilis; Pneumococcus; Streptococcus, and
Haemophilus influenzae.

The particular organism chosen for the somatostatin work next discussed was E. coli strain RR1,
genotype: Pro⁻ Leu⁻ Thi⁻ R_B⁻ M_B⁻ recA⁺ Strʳ Lac y⁻. E. coli RR1 is derived from E. coli HB101 (H. W. Boyer et
al, J. Mol. Biol. (1969) 41, 459-472) by mating with E. coli K12 strain KL16 as the Hfr donor. See J. H. Miller,
Experiments in Molecular Genetics (Cold Spring Harbor, New York, 1972). Cultures of both E. coli RR1 and
E. coli RR1 (pBR322) have been deposited with the American Type Culture Collection without restriction as
to access, respectively ATCC Nos. 31343 and 31344, both deposited 8 Nov 1977. The somatostatin-producing
organism has likewise been deposited (ATCC No. 31447), deposited 28 October 1978.

In the case of human insulin, A and B chain genes were cloned in E. coli K-12 strain 294 (end A, thi⁻,
hsr⁻, hsm_k⁺), ATCC No. 31446, deposited 28th October 1978, and that organism employed in expression of
the A chain (E. coli K-12 strain 294 [pIA1], ATCC No. 57445), deposited 28th October 1978. The B chain of
human insulin was first expressed in a derivative of HB101, i.e., E. coli K-12 strain D1210, a lac⁺ (iQ o⁺ z⁺ y⁺),
and that B gene-containing organism has likewise been deposited (ATCC No. 31449), deposited 28th
October 1978. Alternatively, the B gene may be inserted in and expressed from the organism first
mentioned, i.e., strain 294.

Experimental 

I. Somatostatin

1. Construction of somatostatin gene fragments 

Eight oligodeoxyribonucleotides, respectively labeled A through H in Figure 2, were first constructed,
principally by the modified triester method of K. Itakura et al, J. Am. Chem. Soc. 97, 7327 (1975). However,
in the case of fragments C, E and H resort was had to an improved technique in which fully protected
trimers are first prepared as basic units for building longer oligodeoxyribonucleotides. The improved
technique is schematically depicted in Figure 3, wherein B is thymine, N-benzoylated adenine,
N-benzoylated cytosine or N-isobutyrylated guanine. In brief, and with reference to Figure 3, with an excess
of I (2 mmole), the coupling reaction with II (1 mmole) went almost to completion in 60 min with the aid of a
powerful coupling reagent, 2,4,6-triisopropylbenzenesulfonyl tetrazolide (TPSTe, 4 mmole; 2). After
removal of the 5'-protecting group with 2% benzene sulfonic acid solution, the 5'-hydroxyl dimer V could
be separated from an excess of 3'-phosphodiester monomer IV by simple solvent extraction with aqueous
NaHCO3 solution in CHCl3. The fully protected trimer block was prepared successively from the 5'-hydroxyl
dimer V, I (2 mmole), and TPSTe (4 mmole) and isolated by chromatography on silica gel, as in B. T. Hunt et
al, Chem. and Ind. 1967, 1868 (1967). The yields of trimers made according to the improved technique
appear in Table II.

TABLE II

Yields of fully protected trimers

Sequence    Yield    Sequence    Yield

1 1 1       81%      ATG         69%
TTT         75%      GCC         61%
GGA         41%      CCA         72%
AGA         49%      CAA         72%
ATC         71%      TTA         71%
CCT         61%      CAT         52%
ACA         63%      CCC         73%
ACC         65%      AAC         59%
CGT         51%      GAT         60%

The eight oligodeoxyribonucleotides, after removal of all protecting groups, were purified by
high-pressure liquid chromatography on Permaphase AAX (R. A. Henry et al, J. Chrom. Sci. 11, 358 (1973)).
The purity of each oligomer was checked by homochromatography on thin-layer DEAE-cellulose and also
by gel electrophoresis in a 20% acrylamide slab after labeling of the oligomers with [γ-³²P]-ATP in the
presence of polynucleotide kinase. One major labeled product was obtained from each DNA fragment.

2. Ligation and acrylamide gel analysis of somatostatin DNA 

The 5' OH termini of the chemically synthesized fragments A through H were separately
phosphorylated with T4 polynucleotide kinase. [γ-³²P]-ATP was used in phosphorylation so that reaction
products could be monitored autoradiographically, although it will be appreciated that unlabelled ATP
would serve as well were autoradiography dispensed with. Just prior to the kinase reaction, 25 µCi of
[γ-³²P]-ATP (approx. 1500 Ci/mMol) (Maxam and Gilbert, Proc. Nat. Acad. Sci. U.S.A. 74, 1507 (1977)) was
evaporated to dryness in 0.5 ml Eppendorf tubes. Five micrograms of fragment were incubated with 2 units
of T4 DNA kinase (hydroxylapatite fraction, 2500 units/ml; 27), in 70 mM Tris-HCl pH 7.6, 10 mM MgCl2, 5
mM dithiothreitol in a total volume of 150 µl for 20 min at 37°C. To ensure maximum phosphorylation of the
fragments for ligation purposes, 10 µl of a mixture consisting of 70 mM Tris-HCl pH 7.6, 10 mM MgCl2, 5
mM dithiothreitol, 0.5 mM ATP and two units of DNA kinase were added and incubation continued for an
additional 20 min at 37°C. The fragments (250 ng/µl) were stored at -20°C without further treatment. Kinased
fragments A, B, E, and F (1.25 µg each) were ligated in a total volume of 50 µl in 20 mM Tris-HCl pH 7.6, 10
mM MgCl2, 10 mM dithiothreitol, 0.5 mM ATP and 2 units of T4 DNA ligase (hydroxylapatite fraction, 400
units/ml; 27), for 16 hr at 4°C. Fragments C, D, G and H were ligated under similar conditions. Samples of 2
µl were removed for analysis by electrophoresis on a 10% polyacrylamide gel followed by autoradiography
(H. L. Heyneker et al, Nature 263, 748 (1976)), in which unreacted DNA fragments are represented by fast
migrating material and wherein the monomeric form of the ligated fragments migrates with bromophenol
blue dye (BPB). Some dimerization also occurs by reason of the cohesive ends of the ligated fragments A,
B, E and F, and of the ligated fragments C, D, G and H. These dimers represent the slowest migrating
material, and may be cleaved by restriction endonuclease EcoRI and BamHI, respectively.

The two half molecules (ligated A+B+E+F and ligated C+D+G+H) were joined by an additional
ligation step carried out in a final volume of 150 µl at 4°C for 16 hr. One microliter was removed for analysis.
The reaction mixture was heated for 15 min at 65°C to inactivate the T4 DNA ligase. The heat treatment
does not affect the migration pattern of the DNA mixture. Enough restriction endonuclease BamHI was
added to the reaction mixture to cleave the multimeric forms of the somatostatin DNA in 30 min at 37°C.
After the addition of NaCl to 100 mM, the DNA was digested with EcoRI endonuclease. The restriction
endonuclease digestions were terminated by phenol-chloroform extraction of the DNA. The somatostatin
DNA fragment was purified from unreacted and partially ligated DNA fragments by preparative
electrophoresis on a 10% polyacrylamide gel. The band containing the somatostatin DNA fragment was
excised from the gel and the DNA was eluted by slicing the gel into small pieces and extracting the DNA
with elution buffer (0.5 M ammonium acetate, 10 mM MgCl2, 0.1 mM EDTA, 0.1% SDS) overnight at 65°C.
The DNA was precipitated with 2 volumes of ethanol, centrifuged, redissolved in 200 µl 10 mM Tris-HCl pH
7.6 and dialyzed against the same buffer, resulting in a somatostatin DNA concentration of 4 µg/ml.
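
The restriction digestions described above depend on locating the EcoRI and BamHI recognition sequences at the ends of each somatostatin unit. The following minimal Python sketch shows how such sites can be located in a DNA string. The recognition sequences are the well-known ones for these enzymes; the function name `find_sites` and the short demonstration sequence are hypothetical and are not taken from Figure 2.

```python
# Recognition sequences of the restriction endonucleases named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
    "HaeIII": "GGCC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site in seq."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[name] = positions
    return hits


# Hypothetical stand-in for a ligated multimer: one unit flanked by an EcoRI
# site and a BamHI site, followed by the start of a second unit.
demo = "GAATTC" + "ATGGCTGGTTGTTGATAG" + "GGATCC" + "GAATTC" + "ATG"
hits = find_sites(demo)
```

In this toy sequence the BamHI and EcoRI positions mark where a multimer would be cut back to unit length.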

3. Construction of recombinant plasmids 

Figure 4 schematically depicts the manner in which recombinant plasmids comprising the 
somatostatin gene were constructed, and may be referred to in connection with the following more 
particularized discussion. 

A. The parental plasmid pBR322 

The plasmid chosen for experimental somatostatin cloning was pBR322, a small (molecular wt.
approx. 2.6 megadaltons) plasmid carrying resistance genes to the antibiotics ampicillin (Ap) and
tetracycline (Tc). As indicated in Figure 4, the ampicillin resistance gene includes a cleavage site for the
restriction endonuclease PstI, the tetracycline resistance gene includes a similar site for restriction
endonuclease BamHI, and an EcoRI site is situated between the Apʳ and Tcʳ genes. The plasmid pBR322 is
derived from pBR313, a 5.8 megadalton Apʳ Tcʳ Colⁱᵐᵐ plasmid (R. L. Rodriguez et al, ICN-UCLA Symposia
on Molecular and Cellular Biology 5, 471-77 (1976); R. L. Rodriguez et al, Construction and
Characterization of Cloning Vehicles, in Molecular Mechanisms in the Control of Gene Expression, pp.
471-77, Academic Press, Inc. (1976)). Plasmid pBR322 is characterized and the manner of its derivation fully
described in F. Bolivar et al, "Construction and Characterization of New Cloning Vehicles II. A Multipurpose
Cloning System", Gene (November 1977).

B. Construction of plasmid pBH10

Five micrograms of plasmid pBR322 DNA was digested with 10 units of the restriction endonuclease
EcoRI in 100 mM Tris-HCl pH 7.6, 100 mM NaCl, 6 mM MgCl2 at 37°C for 30 min. The reaction was
terminated by phenol-chloroform extraction; the DNA was then precipitated with two and a half volumes of
ethanol and resuspended in 50 µl of T4 DNA polymerase buffer (67 mM Tris-HCl pH 8.8, 6.7 mM MgCl2, 16.6
mM (NH4)2SO4, 167 µg/ml bovine serum albumin, 50 µM of each of the dNTPs; A. Panet et al, Biochem. 12,
5045 (1973)). The reaction was started by the addition of 2 units of T4 DNA polymerase. After incubation for
30 min at 37°C the reaction was terminated by a phenol-chloroform extraction of the DNA followed by
precipitation with ethanol. Three micrograms of λplac5 DNA (Shapiro et al, Nature 224, 768 (1969)) was
digested for 1 hr at 37°C with the restriction enzyme HaeIII (3 units) in 6 mM Tris-HCl pH 7.6, 6 mM MgCl2, 6
mM β-mercaptoethanol in a final volume of 20 µl. The reaction was stopped by heating for 10 min at 65°C.
The pBR322 treated DNA was mixed with the HaeIII digested λplac5 DNA and blunt-end ligated in a final
volume of 30 µl with 1.2 units of T4 DNA ligase (hydroxylapatite fraction; A. Panet et al, supra) in 20 mM
Tris-HCl pH 7.6, 10 mM MgCl2, 10 mM dithiothreitol, 0.5 mM ATP for 12 hrs at 12°C. The ligated DNA
mixture was dialyzed against 10 mM Tris-HCl pH 7.6, and used for transformation of E. coli strain RR1
conventionally. Transformants were selected for tetracycline and ampicillin resistance on minimal




medium plates containing 40 µg/ml of 5-bromo-4-chloro-indolylgalactoside (X-gal) (J. H. Miller,
Experiments in Molecular Genetics (Cold Spring Harbor, New York, 1972)). Colonies constitutive for the
synthesis of β-galactosidase were identified by their blue color. After screening 45 independently isolated
blue colonies, three of them were found to contain plasmid DNA carrying two EcoRI sites separated by
approximately 200 base pairs. The position of an asymmetrically located HhaI fragment in the 203 b.p.
HaeIII lac control fragment (W. Gilbert et al, in Protein-Ligand Interactions, H. Sund and G. Blauer, Eds. (De
Gruyter, Berlin, 1975) pp. 193-210) allows for the determination of the orientation of the HaeIII fragment,
now an EcoRI fragment, in these plasmids. Plasmid pBH10 was shown to carry the fragment in the desired
orientation, i.e., lac transcription going into the Tcʳ gene of the plasmid.


C. Construction of plasmid pBH20 

Plasmid pBH10 was next modified to eliminate the EcoRI site distal to the lac operator. This was
accomplished by preferential EcoRI endonuclease cleavage at the distal site involving partial protection by
RNA polymerase of the other EcoRI site localized between the Tcʳ and lac promoters, which are only about
40 base pairs apart. After binding RNA polymerase, the DNA (5 µg) was digested with EcoRI (1 unit) in a
final volume of 10 µl for 10 min at 37°C. The reaction was stopped by heating at 65°C for 10 min. The EcoRI
cohesive termini were digested with S1 nuclease in 25 mM Na-acetate pH 4.5, 300 mM NaCl, 1 mM ZnCl2 at
25°C for 5 min. The reaction was stopped by the addition of EDTA (10 mM final) and Tris-HCl pH 8
(50 mM final). The DNA was phenol-chloroform extracted, ethanol precipitated and resuspended in 100 µl
of T4 DNA ligation buffer. T4 DNA ligase (1 µl) was added and the mixture incubated at 12°C for 12 hr. The
ligated DNA was transformed in E. coli strain RR1, conventionally, and Apʳ Tcʳ transformants were selected
on X-gal-antibiotic medium. Restriction enzyme analysis of DNA screened from 10 isolated blue colonies
revealed that these clones carried plasmid DNA with one EcoRI site. Seven of these colonies had retained
the EcoRI site located between the lac and Tcʳ promoters. The nucleotide sequence from the EcoRI site into
the lac-control region of one of these plasmids, pBH20, was confirmed. This plasmid was next used to clone
the somatostatin gene.

D. Construction of plasmid pSOM1

Twenty micrograms of the plasmid pBH20 was digested to completion with restriction endonucleases
EcoRI and BamHI in a final volume of 50 µl. Bacterial alkaline phosphatase was added (0.1 unit of
Worthington BAPF) and incubation was continued for 10 min at 65°C. The reactions were terminated by
phenol-chloroform extraction and the DNA was precipitated with 2 volumes of ethanol, centrifuged and
dissolved in 50 µl 10 mM Tris-HCl pH 7.6, 1 mM EDTA. The alkaline phosphatase treatment effectively
prevents self-ligation of the EcoRI, BamHI treated pBH20 DNA, but circular recombinant plasmids
containing somatostatin DNA can still be formed upon ligation. Since E. coli RR1 is transformed with very
low efficiency by linear plasmid DNA, the majority of the transformants will contain recombinant plasmids.
Fifty microliters of somatostatin DNA (4 µg/ml) were ligated with 25 µl of the BamHI, EcoRI, alkaline
phosphatase-treated pBH20 DNA in a total volume of 50 µl containing 20 mM Tris-HCl pH 7.6, 10 mM
MgCl2, 10 mM dithiothreitol, 0.5 mM ATP, and 4 units of T4 DNA ligase at 22°C. After 10, 20 and 30 min,
additional somatostatin DNA (40 ng) was added to the reaction mixture (the gradual addition of
somatostatin DNA may favor ligation to the plasmid over self-ligation). Ligation was continued for 1 hr
followed by dialysis of the mixture against 10 mM Tris-HCl pH 7.6. In a control experiment, BamHI, EcoRI,
alkaline phosphatase-treated pBH20 DNA was ligated in the absence of somatostatin DNA under similar
conditions. Both preparations were used without further treatment to transform E. coli RR1. The
transformation experiments were carried out in a P3 physical containment facility (National Institutes of
Health, U.S.A., Recombinant DNA Research Guidelines, 1976). Transformants were selected on minimal
medium plates containing 20 µg/ml Ap and 40 µg/ml X-gal. Ten transformants, which were all sensitive to
Tc, were isolated. For reference these were designated pSOM1, pSOM2, etc., through pSOM10. In the control
experiment no transformants were obtained. Four out of the ten transformants contained plasmids with
both an EcoRI site and a BamHI site. The size of the small EcoRI, BamHI fragment of these recombinant
plasmids was in all four instances similar to the size of the in vitro prepared somatostatin DNA. Base
sequence analysis according to Maxam and Gilbert, Proc. Nat. Acad. Sci. U.S.A. 74, 560 (1977), revealed that
the plasmid pSOM1 had the desired somatostatin DNA fragment inserted.

The DNA sequence analysis of the clone carrying plasmid pSOM1 predicts that it should produce a
peptide comprising somatostatin. However, no somatostatin radioimmune activity has been detected in
extracts of cell pellets or culture supernatants, nor is the presence of somatostatin detected when the
growing culture is added directly to 70% formic acid and cyanogen bromide. E. coli RR1 extracts have been
observed to degrade exogenous somatostatin very rapidly. The absence of somatostatin activity in clones
carrying plasmid pSOM1 could well result from intracellular degradation by endogenous proteolytic
enzymes. Plasmid pSOM1 was accordingly employed to construct a plasmid coding for a precursor protein
comprising somatostatin and sufficiently large as to be expected to resist proteolytic degradation.

E. The construction of plasmids pSOM11 and pSOM11-3

A plasmid was constructed in which the somatostatin gene could be located at the C-terminus of the
β-galactosidase gene, keeping the translation in phase. The presence of an EcoRI site near the C-terminus
of this gene and the available amino acid sequence of this protein (B. Polisky et al, Proc. Nat. Acad. Sci.
U.S.A. 73, 3900 (1976), A. V. Fowler et al, Id. at 74, 1507 (1976), A. I. Bukhari et al, Nature New Biology 243,
238 (1973) and K. E. Langley, J. Biol. Chem. 250, 2587 (1975)) permitted insertion of the EcoRI, BamHI
somatostatin gene into the EcoRI site while maintaining the proper reading frame. For the construction of
such a plasmid, pSOM1 DNA (50 µg) was digested with the restriction enzymes EcoRI and PstI in a final
volume of 100 µl. A preparative 5% polyacrylamide gel was used to separate the large PstI-EcoRI fragment
that carries the somatostatin gene from the small fragment carrying the lac control elements. The large
band was excised from the gel and the DNA eluted by slicing the gel into small pieces and extracting the
DNA at 65°C overnight. In a similar way plasmid pBR322 DNA (50 µg) was digested with PstI and EcoRI
restriction endonucleases and the two resulting DNA fragments purified by preparative electrophoresis on
a 5% polyacrylamide gel. The small PstI-EcoRI fragment from pBR322 (1 µg) was ligated with the large
PstI-EcoRI DNA fragment (5 µg) from pSOM1 in a final volume of 50 µl with 1 unit of T4 DNA ligase at 12°C
for 12 hrs. The ligated mixture was used to transform E. coli RR1, and transformants were selected for
ampicillin resistance on X-gal medium. As expected, almost all the Apʳ transformants (95%) gave white
colonies (no lac operator) on X-gal indicator plates. The resulting plasmid, pSOM11, was used in the
construction of plasmid pSOM11-3. A mixture of 5 µg of pSOM11 DNA and 5 µg of λplac5 DNA was
digested with EcoRI (10 units for 30 min at 37°C). The restriction endonuclease digestion was terminated by
phenol-chloroform extraction. The DNA was then ethanol-precipitated and resuspended in T4 DNA ligase
buffer (50 µl). T4 DNA ligase (1 unit) was added to the mixture and incubated at 12°C for 12 hrs. The ligated
mixture was dialyzed against 10 mM Tris-HCl pH 7.6 and used to transform E. coli strain RR1.
Transformants were selected for Apʳ on X-gal plates containing ampicillin and screened for constitutive
β-galactosidase production. Approximately 2% of the colonies were blue (pSOM11-1, 11-2 etc.). Restriction
enzyme analysis of plasmid DNA obtained from these clones revealed that all the plasmids carried a new
EcoRI fragment of approximately 4.4 megadaltons, which carries the lac operon control sites and most of
the β-galactosidase gene. Because two orientations of the EcoRI fragment are possible, the asymmetric
location of a HindIII restriction site was used to determine which of these colonies were carrying this EcoRI
fragment with lac transcription proceeding into the somatostatin gene. HindIII-BamHI double digestions
indicated that only the clones carrying plasmids pSOM11-3, pSOM11-5, pSOM11-6 and pSOM11-7
contained the EcoRI fragment in this orientation.
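
The logic of using an asymmetrically located site to distinguish the two orientations can be sketched in Python as follows. This is a simplified linear model offered as an illustration only: the function name `diagnostic_fragment` and all lengths used (in base pairs) are hypothetical and are not measurements from the plasmids described.

```python
def diagnostic_fragment(insert_len, site_offset, ref_distance, forward=True):
    """Length of the fragment between an internal site of the insert and a
    reference site on the vector, in a simplified linear model.

    insert_len   -- length of the inserted fragment
    site_offset  -- distance of the internal site from the insert's left end
                    when the insert is in the forward orientation
    ref_distance -- distance from the insert's right end to the vector's
                    reference site
    """
    # Reversing the insert mirrors the position of the internal site.
    offset = site_offset if forward else insert_len - site_offset
    return (insert_len - offset) + ref_distance


# Hypothetical numbers: a 6000 bp insert with its internal site 1000 bp from
# one end, and a vector reference site 50 bp beyond the insert.
forward_len = diagnostic_fragment(6000, 1000, 50, forward=True)
reverse_len = diagnostic_fragment(6000, 1000, 50, forward=False)

# A site at the exact midpoint would give the same length either way and
# would therefore be uninformative.
symmetric_fwd = diagnostic_fragment(6000, 3000, 50, forward=True)
symmetric_rev = diagnostic_fragment(6000, 3000, 50, forward=False)
```

Because the two orientations give double-digest fragments of different length, the fragment observed on a gel identifies the orientation.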

4. Radioimmune assay for somatostatin activity

The standard radioimmune assays (RIA) for somatostatin (A. Arimura et al, Proc. Soc. Exp. Biol. Med.
148, 784 (1975)) were modified by decreasing the assay volume and using phosphate buffer. Tyr¹¹
somatostatin was iodinated using a chloramine T procedure (id.). To assay for somatostatin, the sample,
usually in 70% formic acid containing 5 mg/ml of cyanogen bromide, was dried in a conical polypropylene
tube (0.7 ml Sarstedt) over moist KOH under vacuum. Twenty microliters of PBSA buffer (75 mM NaCl; 75
mM sodium phosphate, pH 7.2; 1 mg/ml bovine serum albumin; and 0.2 mg/ml sodium azide) was added,
followed by 40 µl of a [¹²⁵I] somatostatin "cocktail" and 20 µl of a 1,000-fold dilution in PBSA of rabbit
antisomatostatin immune serum S39 (Vale et al, Metabolism 25, 1491 (1976)). The [¹²⁵I] somatostatin
cocktail contained per ml of PBSA buffer: 250 µg normal rabbit gamma globulin (Antibodies, Inc.), 1500
units protease inhibitor ("Trasylol", Calbiochem) and about 100,000 counts of [¹²⁵I] Tyr¹¹-somatostatin.
After at least 16 hours at room temperature, 0.333 ml of goat anti-rabbit gamma globulin (Antibodies, Inc.,
P=0.03) in PBSA buffer was added to the sample tubes. The mixture was incubated 2 hr at 37°C, cooled to
5°C, then centrifuged at 10,000 x g for 5 min. The supernatant was removed and the pellet counted in a
gamma counter. With the amount of antiserum used, 20% of the counts were precipitated with no unlabeled
competing somatostatin. The background with infinite somatostatin (200 ng) was usually 3%. One-half
maximum competition was obtained with 10 pg of somatostatin. Initial experiments with extracts of E. coli
strain RR1 (the recipient strain) indicated that less than 10 pg of somatostatin could easily be detected in the
presence of 16 µg or more of cyanogen bromide-treated bacterial protein. More than 2 µg of protein from
formic acid-treated bacterial extracts interfered somewhat by increasing the background, but cyanogen
bromide cleavage greatly reduced this interference. Reconstruction experiments showed that somatostatin
is stable in cyanogen bromide-treated extracts.

A. Competition by bacterial extracts 

Strains E. coli RR1 (pSOM11-5) and E. coli RR1 (pSOM11-4) were grown at 37°C to 5x10⁸ cells/ml in
Luria broth. Then IPTG (isopropyl β-D-thiogalactoside), an inducer of the lac operon, was added to 1 mM and
growth continued for 2 hr. One-milliliter aliquots were centrifuged for a few seconds in an Eppendorf
centrifuge and the pellets were suspended in 500 µl of 70% formic acid containing 5 mg/ml cyanogen
bromide. After approximately 24 hr at room temperature, aliquots were diluted tenfold in water and the
volumes indicated in Figure 6 were assayed in triplicate for somatostatin. In Figure 6 "B/Bo" is the ratio of
[¹²⁵I] somatostatin bound in the presence of sample to that bound in the absence of competing
somatostatin. Each point is the average of triplicate tubes. The protein content of the undiluted samples
was determined to be 2.2 mg/ml for E. coli RR1 (pSOM11-5) and 1.5 mg/ml for E. coli RR1 (pSOM11-4).

B. The initial screening of pSOM11 clones for somatostatin

Cyanogen bromide-treated extracts of 11 clones (pSOM11-2, pSOM11-3, etc.) were made as described
above for the case of Figure 6. Thirty microliters of each extract was taken in triplicate for radioimmune
assay, the results of which appear in Figure 7. The range of assay points is indicated. The values for picograms
somatostatin were read from a standard curve obtained as part of the same experiment.

The radioimmune assay results described thus far may be summarized as follows. In
contrast to the results of experiments with pSOM1, four clones (pSOM11-3, 11-5, 11-6, and 11-7) were
found to have easily detectable somatostatin radioimmune activity (Figures 6 and 7). Restriction fragment
analysis revealed that pSOM11-3, pSOM11-5, pSOM11-6 and pSOM11-7 had the desired orientation of the
lac operon whereas pSOM11-2 and 11-4 had the opposite orientation. Thus there is a perfect correlation
between the correct orientation of the lac operon and the production of somatostatin radioimmune activity.
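
The B/Bo ratio used in Figure 6 can be computed from triplicate tube counts along the following lines. This is a minimal sketch: the function name `b_over_b0`, the optional background subtraction, and the counts in the example are assumptions made for illustration and are not data from the experiments described.

```python
from statistics import mean


def b_over_b0(sample_counts, zero_counts, background=0.0):
    """Ratio of labeled somatostatin bound with sample present (B) to that
    bound with no competing somatostatin (Bo), each averaged over replicates.

    background is an optional nonspecific count subtracted from both terms;
    subtracting it is an assumption of this sketch, not a stated procedure.
    """
    b = mean(sample_counts) - background
    b0 = mean(zero_counts) - background
    if b0 <= 0:
        raise ValueError("zero-competitor counts must exceed background")
    return b / b0


# Hypothetical triplicate pellet counts.
ratio = b_over_b0([1000, 1100, 900], [2000, 2000, 2000])
ratio_bg = b_over_b0([1000, 1100, 900], [2000, 2000, 2000], background=200)
```

A ratio near 1 indicates no competition by the sample; lower values indicate displacement of the labeled tracer by somatostatin in the extract.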

C. Effects of IPTG induction and CNBr cleavage on positive and negative clones 

The design of the somatostatin plasmid predicts that the synthesis of somatostatin would be under the
control of the lac operon. The lac repressor gene is not included in the plasmid and the recipient strain (E.
coli RR1) contains the wild type chromosomal lac repressor gene which produces only 10 to 20 repressor
molecules per cell. The plasmid copy number (and therefore the number of lac operators) is approximately
20-30 per cell, so complete repression is impossible. As shown in Table III, infra, the specific activity of
somatostatin in E. coli RR1 (pSOM11-3) was increased by IPTG, an inducer of the lac operon. As expected,
the level of induction was low, varying from 2.4 to 7 fold. In experiment 7 (Table III) α activity, a measure of
the first 92 amino acids of β-galactosidase, also was induced by a factor of two. In several experiments no
somatostatin radioimmune activity could be detected prior to cyanogen bromide cleavage of the
total cellular protein. Since the antiserum used in the radioimmune assay, S39, requires a free N-terminal
alanine, no activity was expected prior to cyanogen bromide cleavage.
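
The stated range of induction can be checked with a short calculation. In the Python sketch below, the specific activities are the values listed in Table III for experiments 3 and 7; treating the higher value of each pair as the induced culture and the lower as the uninduced culture is an assumption of this illustration, as is the function name `fold_induction`.

```python
def fold_induction(induced, uninduced):
    """Ratio of specific activity with inducer to that without inducer."""
    if uninduced <= 0:
        raise ValueError("uninduced activity must be positive")
    return induced / uninduced


# pg somatostatin per microgram protein (Table III); the pairing of rows as
# induced/uninduced is assumed for this illustration.
pairs = {
    "experiment 3": (61, 8),
    "experiment 7": (24, 10),
}

fold = {name: fold_induction(hi, lo) for name, (hi, lo) in pairs.items()}
```

Under that assumption the two experiments give roughly 7.6-fold and 2.4-fold induction, in line with the range stated in the text.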



Table III

Somatostatin radioimmune specific activity

Abbreviations: Luria Broth, LB; isopropylthiogalactoside, IPTG; cyanogen bromide, CNBr;
somatostatin, SS. Protein was measured by the method of Bradford, Anal. Biochem. 72, 248 (1976).

Experiment   Strain   Medium          IPTG    CNBr       pg SS/µg
Number                                1 mM    5 mg/ml    protein

1            11-2     LB              +                  <0.1
             11-3     LB                                 12
             11-4     LB              +       +          <0.4
             11-5     LB                      +          15
2            11-3     LB              +       +          12
             11-3     LB              +       4          <0.1
3            11-3     LB                                 61
             11-3     LB                                 8
             11-3     LB              +                  <0.1
4            11-3     LB              +       +          71
             11-3     VB+glycerol*    +       +          62
5            11-3     LB+glycerol     +                  250
6            11-3     LB                                 320
             11-2     LB              +       +          <0.1
7            11-3     LB                      +          24
             11-3     LB                                 10

*Vogel-Bonner minimal medium plus glycerol.

D. Gel filtration of cyanogen bromide-treated extracts

Formic acid and cyanogen bromide-treated extracts of the positive clones (pSOM11-3, 11-5, 11-6, and 11-7) were
pooled (total volume 250 µl), dried, and resuspended in 0.1 ml of 50% acetic acid. [³H] leucine was added
and the sample was applied to a 0.7x47 cm column of Sephadex G-50 in 50% acetic acid. Fifty-microliter
aliquots of the column fractions were assayed for somatostatin. Pooled negative clone extracts (11-2, 11-4,
and 11-11) were treated identically. The results appear in Figure 8. On the same column known
somatostatin (Beckman Corp.) elutes as indicated (SS). In this system, somatostatin is well-separated from
excluded large peptides and fully included small molecules. Only extracts of clones positive for
somatostatin exhibited radioimmune activity in the column fractions and this activity elutes in the same
position as chemically synthesized somatostatin.

Summary of activity information 

The data establishing the synthesis of a polypeptide containing the somatostatin amino acid sequence
are summarized as follows: (1) Somatostatin radioimmune activity is present in E. coli cells having the
plasmid pSOM11-3, which contains a somatostatin gene of proven correct sequence and has the correct
orientation of the lac EcoRI DNA fragment. Cells with the related plasmid pSOM11-2, which has the same
somatostatin gene but an opposite orientation of the lac EcoRI fragment, produce no detectable
somatostatin activity; (2) As predicted by the design scheme, no detectable somatostatin radioimmune
activity is observed until after cyanogen bromide treatment of the cell extract; (3) The somatostatin activity
is under control of the lac operon as evidenced by induction by IPTG, an inducer of the lac operon; (4) The
somatostatin activity co-chromatographs with known somatostatin on Sephadex G-50; (5) The DNA
sequence of the cloned somatostatin gene is correct. If translation is out of phase, a peptide will be made
which is different from somatostatin at every position. Radioimmune activity is detected, indicating that a
peptide closely related to somatostatin is made, and translation must be in phase. Since translation occurs
in phase, the genetic code dictates that a peptide with the exact sequence of somatostatin is made; (6)
Finally, the above samples of E. coli RR1 (pSOM11-3) extract inhibit the release of growth hormone from rat
pituitary cells, whereas samples of E. coli RR1 (pSOM11-2) prepared in parallel and with identical protein
concentration have no effect on growth hormone release.
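
The reading-frame argument in item (5) can be illustrated with a minimal Python sketch. The coding sequence below is a hypothetical example written for this illustration and is not the sequence of Figure 2; the helper name `translate` is likewise an assumption. Only the standard genetic code and the amino acid sequence of somatostatin are taken as established.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Somatostatin in one-letter code (Ala-Gly-Cys-Lys-Asn-Phe-Phe-Trp-Lys-Thr-
# Phe-Thr-Ser-Cys).
SOMATOSTATIN = "AGCKNFFWKTFTSC"

# Hypothetical coding sequence: Met codon, 14 somatostatin codons, stop codon.
HYPOTHETICAL_GENE = (
    "ATG GCT GGT TGT AAG AAC TTC TTT TGG AAG ACT TTC ACT TCG TGT TGA"
).replace(" ", "")


def translate(dna, frame=0):
    """Translate dna from the given frame offset until a stop codon or the end."""
    peptide = []
    for pos in range(frame, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


in_phase = translate(HYPOTHETICAL_GENE, frame=0)
out_of_phase = translate(HYPOTHETICAL_GENE, frame=1)
```

Read in phase, the sequence yields methionine followed by somatostatin; shifted by one base, the same DNA yields an unrelated peptide.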

Stability, yield, and purification of somatostatin

The strains carrying the EcoRI lac operon fragment (pSOM11-2, pSOM11-3, etc.) segregate with respect
to the plasmid phenotype. For example, after about 15 generations, about one-half of the E. coli RR1
(pSOM11-3) culture was constitutive for β-galactosidase, i.e., carried the lac operator, and of these about
half were ampicillin resistant. Strains positive (pSOM11-3) and negative (pSOM11-2) for somatostatin are
unstable, and therefore, the growth disadvantage presumably comes from the overproduction of the
large but incomplete and inactive galactosidase. The yield of somatostatin has varied from 0.001 to 0.03%
of the total cellular protein (Table 1), probably as the result of the selection for cells in culture having
plasmids with a deleted lac region. The highest yields of somatostatin have been from preparations where
growth was started from a single ampicillin resistant, constitutive colony. Even in these cases, 30% of the
cells at harvest had deletions of the lac region. Storage in the frozen state (lyophilization) and growth to
harvest from a single such colony is accordingly indicated for the system described. Yields may be
increased by, e.g., resort to bacterial strains which overproduce lac repressor such that expression of
precursor protein is essentially totally repressed prior to induction and harvest. Alternatively, as previously
discussed, a tryptophan or other operator-promoter system which ordinarily is totally repressed may be
employed.
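
The yield range stated above can be related to the specific activities of Table III by a unit conversion, sketched here in Python. The function name `pg_per_ug_to_percent` is an assumption of this illustration; the input values are specific activities listed in Table III.

```python
def pg_per_ug_to_percent(pg_per_ug):
    """Convert picograms of peptide per microgram of protein to percent by mass.

    1 pg/ug = 1e-12 g / 1e-6 g = 1e-6 (mass fraction) = 1e-4 percent.
    """
    return pg_per_ug * 1e-4


low_yield = pg_per_ug_to_percent(10)    # 10 pg SS per ug protein
high_yield = pg_per_ug_to_percent(320)  # 320 pg SS per ug protein
```

On this conversion, 10 pg/µg corresponds to about 0.001% and 320 pg/µg to about 0.03% of total protein.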

In the crude extract resulting from cell disruption in, e.g., an Eaton Press, the β-galactosidase-
somatostatin precursor protein is insoluble and is found in the first low speed centrifugation pellet. The
activity can be solubilized in 70% formic acid, 6N guanidinium hydrochloride, or 2% sodium dodecyl
sulfate. Most preferably, however, the crude extract from the Eaton Press is extracted with 8N urea and the
residue cleaved with cyanogen bromide. In initial experiments somatostatin activity derived from E. coli
strain RR1 (pSOM11-3) has been enriched approximately 100-fold by alcohol extraction of the cleavage
product and chromatography on Sephadex G-50 in 50% acetic acid. When the product is again
chromatographed on Sephadex G-50 and then subjected to high pressure liquid chromatography,
substantially pure somatostatin may be obtained.

55 

II. Human insulin 

The techniques previously described were next applied to the production of human insulin. Thus, the
genes for insulin B chain (104 base pairs) and for insulin A chain (77 base pairs) were designed from the
amino acid sequence of the human polypeptides, each with single-stranded cohesive termini for the EcoRI
and BamHI restriction endonucleases and each designed for insertion separately into pBR322 plasmids.
The synthetic fragments, deca- to pentadeca-nucleotides, were synthesized by the block phosphotriester
method using trinucleotides as building blocks and ultimately purified with high performance liquid
chromatography (HPLC). The human insulin A and B chain synthetic genes were then cloned separately in
plasmid pBR322. The cloned synthetic genes were fused to an E. coli β-galactosidase gene as before to
provide efficient transcription, translation, and a stable precursor protein. Insulin peptides were cleaved
from β-galactosidase precursor, detected by radioimmunoassay, and purified. Insulin radioimmunoassay
activity was then generated by mixing the E. coli products.

1. Design and synthesis of human insulin genes

The genes constructed for human insulin are depicted in Figure 9. The genes for human insulin, B chain
and A chain, were designed from the amino acid sequences of the human polypeptides. The 5' ends of each
gene have single-stranded cohesive termini for the EcoRI and BamHI restriction endonucleases, for the
correct insertion of each gene into plasmid pBR322. A HindIII endonuclease recognition site was
incorporated into the middle of the B chain gene for the amino acid sequence Glu-Ala to allow amplification
and verification of each half of the gene separately before the construction of the whole B chain gene. The B
chain and the A chain genes were designed to be built from 29 different oligodeoxyribonucleotides, varying
from decamers to pentadecamers. Each arrow indicates the fragment synthesized by the improved
phosphotriester method, H1 to H8 and B1 to B12 for the B chain gene and A1 to A11 for the A chain gene.
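
The stated gene sizes (104 and 77 base pairs) can be checked with simple arithmetic. The sketch below assumes a coding-strand layout read from Figure 9: an EcoRI end (AATTC), an ATG start codon, one codon per amino acid residue, and two consecutive termination codons (TAA TAG). The constant and function names are illustrative only, and the residue counts (30 for the B chain, 21 for the A chain) are the known lengths of the human insulin chains.

```python
# Assumed coding-strand layout (read from Figure 9, not stated in the text):
#   EcoRI end + start codon + 3 bases per residue + two stop codons
ECORI_END = "AATTC"   # 4-base cohesive overhang plus the C that completes the site
START_CODON = "ATG"   # encodes the methionine used later for cyanogen bromide cleavage
STOP_CODONS = "TAATAG"  # two consecutive termination codons


def coding_strand_length(n_residues: int) -> int:
    """Length in bases of the coding strand for a chain of n_residues amino acids."""
    return len(ECORI_END) + len(START_CODON) + 3 * n_residues + len(STOP_CODONS)


B_CHAIN_RESIDUES = 30  # human insulin B chain
A_CHAIN_RESIDUES = 21  # human insulin A chain

print("B chain gene:", coding_strand_length(B_CHAIN_RESIDUES))  # 104
print("A chain gene:", coding_strand_length(A_CHAIN_RESIDUES))  # 77
```

Under these assumptions the totals agree with the 104 and 77 base pairs quoted for the B and A chain genes.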

2. Chemical synthesis of oligodeoxyribonucleotides

Materials and methods for synthesis of oligodeoxyribonucleotides were essentially those described in
Itakura, K. et al (1975) J. Biol. Chem. 250, 4592 and Itakura, K. et al (1975) J. Amer. Chem. Soc. 97, 7327
except for these modifications.

a) The fully protected mononucleotides, 5'-O-dimethoxytrityl-3'-p-chlorophenyl-β-cyanoethyl
phosphates, were synthesized from the nucleoside derivatives using the monofunctional phosphorylating
agent p-chlorophenyl-β-cyanoethyl phosphorochloridate (1.5 molar equivalent) in acetonitrile in the
presence of 1-methylimidazole [Van Boom, J. H. et al (1975) Tetrahedron 31, 2953]. The products were
isolated in large scale (100 to 300 g) by preparative liquid chromatography (Prep 500 LC, Waters
Associates).

b) By using the solvent extraction method [Hirose, T. et al (1978) Tetrahedron Letters, 2449] 32
bifunctional trimers were synthesized (see Table IV) in 5 to 10 mmole scale, and 13 trimers, 3 tetramers, and
4 dimers as the 3' terminus blocks, in 1 mmole scale. The homogeneity of the fully protected trimers was
checked by thin layer chromatography on silica gel in two methanol/chloroform solvent systems: solvent a,
5% v/v and solvent b, 10% v/v (see Table IV). Starting from this library of compounds, 29
oligodeoxyribonucleotides of defined sequence were synthesized, 18 for the B chain and 11 for the A chain
gene.

The basic units used to construct polynucleotides were two types of trimer block, i.e. the bifunctional
trimer blocks of Table IV and corresponding 3'-terminus trimers protected by an anisoyl group at
3'-hydroxy. The bifunctional trimer was hydrolyzed to the corresponding 3'-phosphodiester component
with a mixture of pyridine-triethylamine-water (3:1:1 v/v) and also to the corresponding 5'-hydroxyl
component with 2% benzenesulfonic acid. The 3'-terminus block previously referred to was treated with
2% benzenesulfonic acid to give the corresponding 5'-hydroxyl. The coupling reaction of an excess of the
3'-phosphodiester trimer (1.5 molar equivalent) with the 5'-hydroxyl component, however obtained, (1
molar equivalent) in the presence of 2,4,6-triisopropylbenzenesulfonyl tetrazolide (TPSTe, 3 to 4
equivalents) went almost to completion. To remove the excess of the 3'-phosphodiester block reactant the
reaction mixture was passed through a short silica gel column set up on a sintered glass filter. The column
was washed, first with CHCl3 to elute some side products and the coupling reagent, and then with
CHCl3:MeOH (95:5 v/v) in which almost all of the fully protected oligomer was eluted. Under those
conditions, the charged 3'-phosphodiester block reactant remained in the column. Similarly, block
couplings were repeated until the desired length was constructed.
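
As a rough illustration of the block approach, the sketch below splits an oligonucleotide of a given length into bifunctional trimers followed by a single 3'-terminus block of two, three, or four bases. This partition rule is an assumption made for illustration: the text lists dimers, trimers, and tetramers as 3'-terminus blocks but does not say how each of the 29 fragments was actually divided. The function name `plan_blocks` is hypothetical.

```python
def plan_blocks(length: int) -> list:
    """Split an oligonucleotide into block sizes, listed 5' to 3'.

    Assumed rule (not from the source): use bifunctional trimers throughout
    and close with one 3'-terminus block of 2, 3, or 4 bases.
    """
    if length < 5:
        raise ValueError("expects at least one trimer plus a 3'-terminus block")
    # Pick the terminus block size that leaves a multiple of three bases.
    terminus = {0: 3, 1: 4, 2: 2}[length % 3]
    n_trimers = (length - terminus) // 3
    return [3] * n_trimers + [terminus]


# Decamers to pentadecamers, the size range given for the synthetic fragments.
for n in range(10, 16):
    blocks = plan_blocks(n)
    print(n, blocks, "couplings:", len(blocks) - 1)
```

Under this assumed rule a decamer needs two couplings (3 + 3 + 4) and a pentadecamer needs four (five trimers), which suggests why a small set of dimer and tetramer terminus blocks suffices alongside the trimer library.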

TABLE IV
Synthesis of trimer building blocks

No.  Compound*  Yield** (%)  Rf (a)  Rf (b)  Purity*** (%)  In Figure 9, present in:
 1.  AAG        47           0.15    0.40    93             B5, B6
 2.  AAT        49           0.25    0.52    95             H1, A1, A6
 3.  AAC        52           0.28    0.55    93             H5, B6, A2, A8
 4.  ACT        43           0.27    0.53    91             B4, B5, A6
 5.  ACC        56           0.33    0.60    96             B7
 6.  ACG        39           0.18    0.45    90             H5, B7
 7.  AGG        45           0.10    0.26    89             H6, H7, B9
 8.  AGT        33           0.14    0.40    96             B9, A2, A11
 9.  AGC        50           0.19    0.48    92             H8, B1, A5, A10
10.  AGA        48           0.24    0.50    91             A9
11.  TTC        --           0.26    0.52    95             B4, B7, A3
12.  TTC        49           0.11    0.31    94             H3, H5, A2, A3, A5
13.  TCT        58           0.24    0.49    96             A4
14.  TCA        45           0.28    0.53    92             H1, H2, H4, A1
15.  TCG        39           0.12    0.34    91             A2
16.  TGG        32           0.10    0.28    87             H3, A1, A10
17.  TGC        51           0.18    0.47    93             H6, B2, A4, A7, A8
18.  TGA        46           0.12    0.37    94             H7
19.  TAG        61           0.22    0.50    90             B4, A11
20.  TAA        55           0.17    0.44    95             B5, A10
21.  CCT        53           0.30    0.55    97             H3, H4, B10
22.  CAC        47           0.25    0.51    92             A3
23.  CAA        58           0.25    0.51    93             H2, H6, H8, A7
24.  CTT        41           0.28    0.54    92             B2, B9, A4
25.  CGA        40           0.27    0.52    93             A7
26.  CGT        75           0.25    0.50    89             H2, H4, B3, B1
27.  GGT        35           0.09    0.26    90             B3
28.  GTT        46           0.18    0.45    93             B2
29.  GTA        38           0.25    0.50    95             B6, B8, A6
30.  GAA        39           0.15    0.39    88             H7, B3, B8, A5
31.  GAT        52           0.22    0.49    89             B10, A9
32.  GCA        42           0.14    0.39    93             A9

* Fully protected trideoxynucleotides: 5'-O-dimethoxytrityl-3'-p-chlorophenyl-β-cyanoethyl phosphate.
** Yield was the overall yield calculated from the 5'-hydroxyl monomers.
*** Based on HPLC analysis.

High performance liquid chromatography (HPLC) was used extensively during oligonucleotide
synthesis for a) analysis of each trimer and tetramer block, b) analysis of the intermediate fragments
(hexamers, nonamers and decamers), c) analysis of the last coupling reaction, and d) purification of the
final products. The HPLC was performed by using a Spectra-Physics 3500B liquid chromatograph. After
removal of all protecting groups by conc. NH4OH at 50°C (6 h) and 80% AcOH at room temperature (15 min),
the compounds were analyzed on a Permaphase AAX (DuPont) [Van Boom, J. et al (1977) J.
Chromatography 131, 169] column (1 m × 2 mm), using a linear gradient of solvent B (0.05 M KH2PO4, 1.0 M
KCl, pH 4.5) in solvent A (0.01 M KH2PO4, pH 4.5). The gradient was formed starting with buffer A and
applying 3% of buffer B per minute. The elution was performed at 60°C, with a flow rate of 2 ml per minute.
The purification of the 29 final oligonucleotides also was performed on Permaphase AAX, under the same
conditions reported above. The desired peak was pooled, desalted by dialysis, and lyophilized. After
labeling the 5' termini with [γ-32P]ATP using T4 polynucleotide kinase, the homogeneity of each
oligonucleotide was checked by electrophoresis on a 20% polyacrylamide gel.
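
The gradient described above can be expressed as a small calculation. The sketch below assumes ideal linear mixing of the two buffers and a ramp of 3% buffer B per minute that is capped at 100% B; the function name `eluent_at` and its parameter are illustrative, not part of the described apparatus.

```python
# Buffer compositions taken from the text (molar concentrations).
PHOSPHATE_A = 0.01  # KH2PO4 in solvent A
PHOSPHATE_B = 0.05  # KH2PO4 in solvent B
KCL_B = 1.0         # KCl in solvent B (solvent A contains none)


def eluent_at(t_min: float, ramp_pct_per_min: float = 3.0):
    """Return (fraction of B, phosphate M, KCl M) at t_min minutes into the run.

    Assumes ideal mixing and a linear ramp that stops once the eluent is 100% B.
    """
    frac_b = min(max(t_min, 0.0) * ramp_pct_per_min / 100.0, 1.0)
    phosphate = (1.0 - frac_b) * PHOSPHATE_A + frac_b * PHOSPHATE_B
    kcl = frac_b * KCL_B
    return frac_b, phosphate, kcl


for t in (0, 10, 20, 30):
    frac_b, phosphate, kcl = eluent_at(t)
    print(f"t={t:>2} min  B={frac_b:.0%}  KH2PO4={phosphate:.3f} M  KCl={kcl:.2f} M")
```

On these assumptions the KCl concentration rises by 0.03 M per minute, and the gradient reaches pure solvent B after roughly 33 minutes.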

3. Assembly and cloning of the B chain gene and the A chain gene

The gene for the B chain of insulin was designed to have an EcoRI restriction site on the left end, a
HindIII site in the middle and a BamHI site at the right end. This was done so that both halves, the left
EcoRI-HindIII half (BH) and the right HindIII-BamHI half (BB), could be separately cloned in the convenient
cloning vehicle pBR322 and, after their sequences had been verified, joined to give the complete B gene
(Figure 10). The BB half was assembled by ligation from 10 oligodeoxyribonucleotides, labeled B1 to B10 in
Figure 9, made by phosphotriester chemical synthesis. B1 and B10 were not phosphorylated, thereby
eliminating unwanted polymerization of these fragments through their cohesive ends (HindIII and BamHI).
After purification by preparative acrylamide gel electrophoresis and elution of the largest DNA band, the BB
fragment was inserted into plasmid pBR322 which had been cleaved with HindIII and BamHI. About 50% of
the ampicillin resistant colonies derived from the DNA were sensitive to tetracycline, indicating that a
nonplasmid HindIII-BamHI fragment had been inserted. The small HindIII-BamHI fragments from four of
these colonies (pBB101 to pBB104) were sequenced and found to be correct as designed.

The BH fragment was prepared in a similar manner and inserted into pBR322 which had been cleaved
with EcoRI and HindIII restriction endonucleases. Plasmids from three ampicillin resistant, tetracycline-
sensitive transformants (pBH1 to pBH3) were analyzed. The small EcoRI-HindIII fragments were found to
have the expected nucleotide sequence.

The A chain gene was assembled in three parts. The left four, middle four, and right four
oligonucleotides (see Figure 9) were ligated separately, then mixed and ligated (oligonucleotides A1 and
A12 were unphosphorylated). The assembled A chain gene was phosphorylated, purified by gel
electrophoresis, and cloned in pBR322 at the EcoRI-BamHI sites. The EcoRI-BamHI fragments from two
ampicillin resistant, tetracycline sensitive clones (pA10, pA11) contained the desired A gene sequence.

4. Construction of plasmids for expression of A and B insulin genes

Figure 10 illustrates the construction of the lac-insulin B plasmid (pIB1). Plasmids pBH1 and pBB101
were digested with EcoRI and HindIII endonucleases. The small BH fragment of pBH1 and the large
fragment of pBB101 (containing the BB fragment and most of pBR322) were purified by gel electrophoresis,
mixed, and ligated in the presence of EcoRI-cleaved λplac5. The 4.4 megadalton EcoRI fragment of λplac5
contains the lac control region and the majority of the β-galactosidase structural gene. The configuration of
the restriction sites ensures correct joining of BH to BB. The lac EcoRI fragment can insert in two
orientations; thus, only half of the clones obtained after transformation should have the desired
orientation. The orientation of ten ampicillin resistant, β-galactosidase constitutive clones was checked by
restriction analysis. Five of these colonies contained the entire B gene sequence and the correct reading
frame from the β-galactosidase gene into the B chain gene. One, pIB1, was chosen for subsequent
experiments.
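
The expectation that half of the clones carry the lac fragment in the desired orientation can be put in numbers with a simple binomial model. The sketch below assumes that each clone takes up the fragment independently and that both orientations are equally likely; these are modelling assumptions, and the function names are illustrative.

```python
from math import comb


def prob_exactly(k: int, n: int, p: float = 0.5) -> float:
    """Probability that exactly k of n clones have the desired orientation."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_least_one(n: int, p: float = 0.5) -> float:
    """Probability that at least one of n clones has the desired orientation."""
    return 1.0 - (1.0 - p) ** n


n_clones = 10  # number of constitutive clones screened in the text
print("P(exactly 5 of 10 correct) =", prob_exactly(5, n_clones))
print("P(at least 1 of 10 correct) =", prob_at_least_one(n_clones))
```

Under this model, finding five correctly oriented clones among ten is the single most likely outcome, and screening ten clones makes it very unlikely that none has the desired orientation.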

In a similar experiment, the 4.4 megadalton lac fragment from λplac5 was introduced into the pA11
plasmid at the EcoRI site to give pIA1. pIA1 is identical to pIB1 except that the A gene fragment is
substituted for the B gene fragment. DNA sequence analysis demonstrated that the correct A and B chain
gene sequences were retained in pIA1 and pIB1 respectively.

5. Expression 

The strains which contain the insulin genes correctly attached to β-galactosidase both produce large
quantities of a protein the size of β-galactosidase. Approximately 20% of the total cellular protein was this
β-galactosidase-insulin A or B chain hybrid. The hybrid proteins are insoluble and were found in the first
low speed pellet where they constitute about 50% of the protein.

To detect the expression of the insulin A and B chains, we used a radioimmunoassay (RIA) based on
the reconstitution of complete insulin from the separate chains. The insulin reconstitution procedure of
Katsoyannis et al (1967) Biochemistry 6, 2642-2655, adapted to a 27-microliter assay volume, provides a
very suitable assay. Easily detectable insulin activity is obtained after mixing and reconstituting
S-sulfonated derivatives of the insulin chains. The separate S-sulfonated chains of insulin do not react
significantly, after reduction and oxidation, with the anti-insulin antibody used.

To use the reconstitution assay we partially purified the β-galactosidase-A or B chain hybrid protein,
cleaved with cyanogen bromide, and formed S-sulfonated derivatives.

The evidence that we have obtained correct expression from chemically synthesized genes for human
insulin can be summarized as follows: a) Radioimmune activity has been detected for both chains. b) The
DNA sequences obtained after cloning and plasmid construction have been directly verified to be correct as
designed. Since radioimmune activity is obtained, translation must be in phase. Therefore, the genetic
code dictates that peptides with the sequences of human insulin are being produced. c) The E. coli
products, after cyanogen bromide cleavage, behave as insulin chains in three different chromatographic
systems which separate on different principles (gel filtration, ion exchange, and reversed phase HPLC). d)
The E. coli produced A chain has been purified on a small scale by HPLC and has the correct amino acid
composition.

Claims 

1. A recombinant plasmid suited for transformation of a bacterial host wherein the plasmid comprises
a homologous regulon, heterologous DNA, and one or more termination codon(s), the heterologous DNA
encoding a desired functional heterologous polypeptide or intermediate therefor which is not degraded by
endogenous proteolytic enzymes, said DNA being positioned in proper reading frame with said
homologous regulon between said regulon and the termination codon(s), whereby on translation of the
transcription product of the heterologous DNA in a suitable bacterium, the resulting expression product is
said desired functional polypeptide or intermediate therefor in recoverable form.

2. A recombinant plasmid according to claim 1, wherein the regulon is essentially identical to a regulon 
ordinarily present in the chromosomal DNA of the bacterial host. 

3. A recombinant plasmid according to claim 1 or 2, wherein the said heterologous DNA comprises
cDNA.

4. A recombinant plasmid according to claim 1 or 2, wherein the said heterologous DNA comprises 
organic synthesis-derived DNA. 

5. A recombinant plasmid according to any one of the preceding claims, wherein the plasmid is a
bacterial plasmid.

6. A recombinant plasmid according to any one of the preceding claims, wherein the regulon
comprises the Escherichia coli lac or the Escherichia coli tryptophan promoter-operator system.

7. A recombinant plasmid according to any one of the preceding claims, wherein the heterologous
DNA codes for a functional polypeptide as such and the heterologous DNA is immediately preceded by a
translational start codon.

8. A recombinant plasmid according to any one of claims 1 to 6, wherein the heterologous polypeptide 
is a mammalian hormone or an intermediate therefor. 

9. A process for the production of a recombinant plasmid as defined in any one of the preceding claims
which comprises treating a length of double stranded DNA comprising an intact replicon and in sequence,
(a) a regulon for controlling transcription and translation in a bacterial host and (b) a restriction
endonuclease recognition site, with a suitable restriction endonuclease to form a DNA fragment that
comprises the replicon and the regulon, and ligating thereto in proper reading frame with said regulon
heterologous DNA encoding a functional heterologous polypeptide or intermediate therefor which is not
degraded by endogenous proteolytic enzymes, said heterologous DNA having a terminal nucleotide
grouping which is ligatable to said DNA fragment, to give said recombinant plasmid.

10. A bacterium transformed with a recombinant plasmid according to any one of claims 1 to 8. 

11. A bacterial culture comprising transformed bacteria according to claim 10. 

12. A process for the bacterial production of a functional heterologous polypeptide or intermediate
therefor comprising growing a bacterial culture as defined in claim 11 to bring about expression of said
polypeptide or intermediate in recoverable form and recovering said polypeptide or intermediate.

13. A process according to claim 12 for producing an immunogenic substance comprising a
polypeptide hapten, comprising:

(a) providing a recombinant plasmid containing a homologous regulon, and in proper reading frame
therewith, a heterologous DNA sequence encoding the hapten, a DNA sequence encoding a second amino
acid sequence sufficient in size to render the product of DNA expression immunogenic and one or more
termination codons;

(b) growing a bacterium transformed with the recombinant plasmid, occasioning expression of a
conjugate polypeptide consisting essentially of the amino acid sequence of the hapten and the second
amino acid sequence; and

(c) testing the conjugate polypeptide for its ability to raise antibodies against said hapten.

14. A process according to claim 12 for producing an immunogenic substance comprising
somatostatin, comprising:

(a) providing a recombinant plasmid containing a homologous regulon and in proper reading frame
therewith a heterologous DNA sequence encoding the somatostatin, and a DNA sequence encoding a
second amino acid sequence sufficient in size to render the product of DNA expression immunogenic and
one or more termination codons; and

(b) in a bacterium transformed with the recombinant plasmid, occasioning expression of a conjugate
polypeptide consisting essentially of the somatostatin and the second amino acid sequence.

15. A process according to claim 13 or 14, wherein the expression product comprises in excess of about
100 amino acids.

16. A process according to claim 13 or 14, wherein the expression product comprises in excess of about
200 amino acids.

Patent claims

1. A recombinant plasmid suitable for the transformation of a bacterial host, wherein the plasmid
comprises a homologous regulon, a heterologous DNA and one or more termination codon(s), the
heterologous DNA encoding the desired functional heterologous polypeptide or its intermediate, which is
not degraded by endogenous proteolytic enzymes, the DNA being located in proper reading frame with the
homologous regulon between the regulon and the termination codon(s), whereby on translation of the
transcription product of the heterologous DNA in a suitable bacterium the resulting expression product is
the desired functional polypeptide or its intermediate in isolable form.

2. A recombinant plasmid according to claim 1, characterized in that the regulon is essentially
identical to the regulon normally present in the chromosomal DNA of the bacterial host.

3. A recombinant plasmid according to claim 1 or 2, characterized in that the heterologous DNA
contains cDNA.

4. A recombinant plasmid according to claim 1 or 2, characterized in that the heterologous DNA
contains DNA derived from an organic synthesis.

5. A recombinant plasmid according to any one of the preceding claims, characterized in that the
plasmid is a bacterial plasmid.

6. A recombinant plasmid according to any one of the preceding claims, characterized in that the
regulon contains the Escherichia coli lac or the Escherichia coli tryptophan promoter-operator system.

7. A recombinant plasmid according to any one of the preceding claims, characterized in that the
heterologous DNA codes for a functional polypeptide as such and that the heterologous DNA is preceded
by a translation start codon.

8. A recombinant plasmid according to any one of claims 1 to 6, characterized in that the heterologous
polypeptide is a mammalian hormone or an intermediate thereof.

9. A process for the production of a recombinant plasmid according to any one of the preceding
claims, characterized in that a length of double-stranded DNA comprising an intact replicon and, in
sequence, (a) a regulon for controlling transcription and translation in a bacterial host and (b) a
recognition site for a restriction endonuclease is treated with a suitable restriction endonuclease to form a
DNA fragment containing the replicon and the regulon, and heterologous DNA is attached thereto in
proper reading frame with the regulon, which heterologous DNA encodes a functional heterologous
polypeptide or its intermediate that is not degraded by endogenous proteolytic enzymes, the heterologous
DNA having a terminal nucleotide grouping that can be attached to the DNA fragment, whereby the
recombinant plasmid is obtained.

10. A bacterium, characterized in that it is transformed with a recombinant plasmid according to any
one of claims 1 to 8.

11. A bacterial culture, characterized in that it contains transformed bacteria according to claim 10.

12. A process for the bacterial production of a functional heterologous polypeptide or its intermediate,
characterized in that the bacterial culture according to claim 11 is grown to obtain expression of the
polypeptide or its intermediate in isolable form, and the polypeptide or its intermediate is isolated.

13. A process according to claim 12 for the production of an immunogenic substance containing a
polypeptide hapten, characterized in that

(a) a recombinant plasmid is provided which contains a homologous regulon and, in proper reading
frame therewith, a heterologous DNA sequence encoding the hapten, a DNA sequence encoding a second
amino acid sequence sufficient in size to render the product of DNA expression immunogenic, and one or
more termination codon(s);

(b) a bacterium transformed with the recombinant plasmid is grown, whereby expression of a
conjugate polypeptide consisting essentially of the amino acid sequence of the hapten and the second
amino acid sequence takes place; and

(c) the conjugate polypeptide is tested for its ability to raise antibodies against the hapten.

14. A process according to claim 12 for the production of an immunogenic substance containing
somatostatin, characterized in that

(a) a recombinant plasmid is provided which contains a homologous regulon and, in proper reading
frame therewith, a heterologous DNA sequence encoding the somatostatin, and a DNA sequence encoding
a second amino acid sequence of sufficient size to render the product of DNA expression immunogenic,
and one or more termination codon(s);

(b) a bacterium transformed with the recombinant plasmid expresses a conjugate polypeptide
consisting essentially of somatostatin and the second amino acid sequence.

15. A process according to claim 13 or 14, characterized in that the expression product contains in
excess of about 100 amino acids.

16. A process according to claim 13 or 14, characterized in that the expression product contains in
excess of about 200 amino acids.

Claims

1. A recombinant plasmid suitable for the transformation of a bacterial host, said plasmid comprising
a homologous regulon, a heterologous DNA and one or more termination codons, the heterologous DNA
coding for a desired functional heterologous polypeptide or an intermediate thereof which is not degraded
by endogenous proteolytic enzymes, said DNA being placed in the proper reading frame with said
homologous regulon between said regulon and the termination codon or codons, so that on translation of
the transcription product of the heterologous DNA in a suitable bacterium the resulting expression product
is said desired functional polypeptide or an intermediate thereof, in a form allowing its separation.

2. A recombinant plasmid according to claim 1, in which the regulon is practically identical to a
regulon usually present in the chromosomal DNA of the bacterial host.

3. A recombinant plasmid according to claim 1 or 2, in which the heterologous DNA comprises cDNA.

4. A recombinant plasmid according to claim 1 or 2, in which the heterologous DNA comprises DNA
obtained by organic synthesis.

5. A recombinant plasmid according to any one of the preceding claims, said plasmid being a
bacterial plasmid.

6. A recombinant plasmid according to any one of the preceding claims, in which the regulon
comprises the Escherichia coli lac promoter-operator system or the Escherichia coli tryptophan promoter-
operator system.

7. A recombinant plasmid according to any one of the preceding claims, in which the heterologous
DNA codes for a functional polypeptide as such and the heterologous DNA is immediately preceded by a
translation initiation codon.

8. A recombinant plasmid according to any one of claims 1 to 6, in which the heterologous
polypeptide is a mammalian hormone or an intermediate thereof.

9. A process for the production of a recombinant plasmid according to any one of the preceding
claims, which consists in treating a length of double-stranded DNA comprising an intact replicon and, in
succession, (a) a regulon for controlling transcription and translation in a bacterial host and (b) a
restriction endonuclease recognition site, with a suitable restriction endonuclease to form a DNA fragment
which comprises the replicon and the regulon, and in ligating thereto, in proper reading frame with said
regulon, a heterologous DNA coding for a functional heterologous polypeptide or an intermediate thereof
which is not degraded by endogenous proteolytic enzymes, said heterologous DNA comprising a terminal
nucleotide grouping which can be ligated to said DNA fragment, to give said recombinant plasmid.

10. A bacterium transformed by means of a recombinant plasmid according to any one of claims 1 to 8.

11. A bacterial culture comprising transformed bacteria according to claim 10.

12. A process for the bacterial production of a functional heterologous polypeptide or an intermediate
thereof, consisting in growing a bacterial culture according to claim 11 to bring about the expression of
said polypeptide or said intermediate in a form allowing its separation, and in recovering said polypeptide
or said intermediate.

13. A process according to claim 12 for the production of an immunogenic substance comprising a
polypeptide hapten, consisting in:

(a) producing a recombinant plasmid containing a homologous regulon and, in proper reading frame
therewith, a heterologous DNA sequence coding for the hapten, a DNA sequence coding for a second
amino acid sequence of sufficient size to render the product of DNA expression immunogenic, and one or
more termination codons;

(b) growing a bacterium transformed by means of the recombinant plasmid, bringing about the
expression of a conjugate polypeptide comprising essentially the amino acid sequence of the hapten and
the second amino acid sequence; and

(c) testing the ability of the conjugate polypeptide to bring about the production of antibodies against
said hapten.

14. A process according to claim 12 for the production of an immunogenic substance comprising
somatostatin, consisting in:

(a) producing a recombinant plasmid containing a homologous regulon and, in proper reading frame
therewith, a heterologous DNA sequence coding for the somatostatin, and a DNA sequence coding for a
second amino acid sequence of sufficient size to render the product of DNA expression immunogenic, and
one or more termination codons; and

(b) in a bacterium transformed by means of the recombinant plasmid, bringing about the expression
of a conjugate polypeptide comprising essentially the somatostatin and the second amino acid sequence.

15. A process according to claim 13 or 14, in which the expression product comprises more than about
100 amino acids.

16. A process according to claim 13 or 14, in which the expression product comprises more than about
200 amino acids.



[Figure 9: synthetic genes for human insulin. Upper panel: B chain gene (residues 1 to 30) with EcoRI,
HindIII and BamHI sites. Lower panel: A chain gene (residues 1 to 21) with EcoRI and BamHI sites.
Arrows mark the synthesized oligonucleotide fragments. Coding strands, 5' to 3', grouped by codon:]

B chain: AATTC ATG TTC GTC AAT CAG CAC CTT TGT GGT TCT CAC CTC GTT GAA GCT TTG TAC CTT GTT TGC GGT
         GAA CGT GGT TTC TTC TAC ACT CCT AAG ACT TAA TAG

A chain: AATTC ATG GGC ATC GTT GAA CAG TGT TGC ACT TCT ATC TGC TCT CTT TAC CAG CTT GAG AAC TAC TGT
         AAC TAA TAG